Smart Computer Aided Translation Environment - SCATE
Abstract
We aim to improve translators' efficiency through five scientific objectives.

Concerning improvements in translation technology, we are investigating syntax-based fuzzy matching, in which similarity is estimated from syntactic edit distance or similar measures. We are also working on syntax-based MT using synchronous tree substitution grammars induced from parallel node-aligned treebanks, and are building a decoder that uses these grammars in translation.

Concerning improvements in the evaluation of computer-aided translation, we have developed a taxonomy of typical MT errors and are constructing a manually annotated corpus of 3000 segments of Google Translate MT errors. We are also monitoring the post-editing behaviour of translators.

Concerning improvements in automated terminology extraction from comparable corpora, we have developed C-BiLDA, a multilingual topic model that does not assume linked documents to have identical topic distributions. For the task of cross-lingual document categorization, we trained it on a comparable corpus of Wikipedia documents and inferred cross-lingual document representations on a document categorization dataset. The document representations and category labels are fed to an SVM classifier: we train on the source-language documents and predict the labels of the target-language documents. C-BiLDA outperforms the state of the art in multilingual topic modeling.

Concerning improvements in speech recognition accuracy, we clustered words by their translations in multiple languages: words that share a translation in many languages are considered synonyms. By adding context and filtering out words that do not belong to the same part of speech, we find meaningful word clusters to incorporate into a language model. We found no improvements, and attribute this in part to errors made by the MT system and to the incorporation technique (hard-clustered class-based n-grams). We will take context into account during evaluation and/or further improve the word clusters by using the translations as features in vector space modeling techniques.

Concerning improvements in workflows and personalised user interfaces, we reviewed existing translation systems and created an inventory of their features and configuration options. Six Flemish companies were interviewed about their practices and their vision for future CAT tools. A worldwide survey was conducted, with more than 135 responses. Translators' practices were analysed in detail by observing more than 7 translators in a contextual inquiry. In the upcoming period, the results of the different studies will be analysed to obtain a model of how CAT tools can support workflows for specific translators. This model will serve as a basis for the personalised visualisations that form part of interfaces for translation work. In contrast with traditional engineering approaches, this model will also be usable by translators as part of the configuration of their personal CAT tool.
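
The translation-based word clustering described for the speech recognition objective can be illustrated with a minimal Python sketch. It is not the project's implementation: the function name, the `min_shared_languages` threshold, the toy dictionary, and the use of union-find to form hard clusters are all assumptions made for illustration, and the "adding context" step is not modelled.

```python
from collections import defaultdict


def cluster_by_translations(translations, pos_tags, min_shared_languages=2):
    """Group words that share a translation in enough languages.

    translations: word -> {language: translation}   (hypothetical toy format)
    pos_tags:     word -> part-of-speech tag
    Words are merged only if they have the same part of speech and share
    a translation in at least `min_shared_languages` languages (assumed threshold).
    """
    words = sorted(translations)
    parent = {w: w for w in words}

    def find(w):
        # Union-find lookup with path halving.
        while parent[w] != w:
            parent[w] = parent[parent[w]]
            w = parent[w]
        return w

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    for i, a in enumerate(words):
        for b in words[i + 1:]:
            # Filter out pairs that do not belong to the same part of speech.
            if pos_tags.get(a) != pos_tags.get(b):
                continue
            shared = sum(
                1
                for lang, t in translations[a].items()
                if translations[b].get(lang) == t
            )
            if shared >= min_shared_languages:
                union(a, b)

    # Each word ends up in exactly one cluster (hard clustering).
    clusters = defaultdict(list)
    for w in words:
        clusters[find(w)].append(w)
    return sorted(sorted(c) for c in clusters.values())


# Toy data, invented for this example only.
toy_translations = {
    "car": {"nl": "auto", "fr": "voiture", "de": "Auto"},
    "automobile": {"nl": "auto", "fr": "voiture", "de": "Auto"},
    "drive": {"nl": "rijden", "fr": "conduire", "de": "fahren"},
    "ride": {"nl": "rijden", "fr": "trajet", "de": "fahren"},
}
toy_pos = {"car": "NOUN", "automobile": "NOUN", "drive": "VERB", "ride": "NOUN"}

toy_clusters = cluster_by_translations(toy_translations, toy_pos)
print(toy_clusters)
```

In this toy data, "drive" and "ride" share translations in two languages but carry different part-of-speech tags, so the filter keeps them apart, while "car" and "automobile" are merged. Because union-find merges transitively, every word belongs to exactly one class, which is the kind of hard clustering a class-based n-gram model expects.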